Description

BACKGROUND OF THE INVENTION
Field of the Invention

[0001] The invention relates to an image processing apparatus and, more particularly, to a technique in which not
only characters but also blank spaces are recognized from input image information and the result of the recognition is
generated in a desired form.


Related Background Art 

[0002] Demand is rapidly increasing for electronically converting the text of newspapers, books, and the like and for
filing or storing it as a database so that it can be used efficiently. Character recognizing apparatuses which can
input a printed document at high speed and with high accuracy are being developed rapidly.

[0003] An OCR (optical character reading apparatus) is one such character recognizing apparatus.
[0004] Hitherto, the output form of a document recognized by the OCR has been fixed, and the user cannot
designate the output form. That is, one of two modes has been predetermined: the mode to display only character codes,
ignoring the style of the document, or the mode to generate blank and return codes in accordance with the style of the
document. In addition, a paragraph cannot be recognized with high accuracy.
[0005] Because the user cannot change the output form, the above conventional apparatus has the following
drawbacks.

(1) When only character codes are generated, blank and return codes must be newly input in accordance with the
desired document form after completion of the recognition.

(2) If both the blanks and the paragraphs are recognized from the beginning and are displayed by a display section,
the recognition takes time.

[0006] That is, as long as the output form cannot be designated, it takes time to obtain a desired document.

(3) Even when a paragraph is recognized, it cannot be recognized with high precision.


[0007] JP-A-60163171 describes a sentence reader for analysing printed text organised in either horizontal or vertical 
lines to recognise paragraph columns or rows of characters. Sentences in a text are identified by adding an indicating 
sequence, which sequence is then read by a pattern reader. 

[0008] JP-A-3025692 describes a method wherein blank spaces between characters in a text are discriminated from
longer spaces at the end of a paragraph by comparing them with the average pitch between detected characters.

[0009] However, this document, which has been published after the priority date claimed for the present invention, 
does not clearly disclose how to determine the number of blank codes which must be assigned to represent said blank 
spaces between characters. 

[0010] Such assignment is performed by the present invention by the means and steps set out in appended claims
1 and 5, respectively.

[0011] Embodiments of the invention will now be described, with reference to the accompanying drawings, in which: 

Fig. 1 is a block diagram showing an embodiment of the invention; 

Fig. 2 is an explanatory diagram of a recognition mode selecting screen; 
Figs. 3A and 3B are diagrams showing an example of a document image;

Fig. 4 is a diagram showing main data areas; 

Fig. 5 is a flowchart of a recognition result outputting section; 

Fig. 6 is a detailed flowchart of a character train output section; 

Fig. 7 is a diagram showing an example of a document image; and 
Fig. 8 is a flowchart of paragraph detection.

[0012] The invention can be applied to a character recognizing apparatus or to a character processing apparatus
having a character recognizing apparatus. The invention can be applied to an apparatus comprising a single piece of
equipment or to a system comprising a plurality of pieces of equipment.

[0013] Fig. 1 is a block diagram of a character recognizing apparatus showing an embodiment of the invention.
Reference numeral 1 denotes a CRT display section for displaying document image data by raster scanning; 2 a video
RAM (VRAM) to store pattern development information of one screen of the CRT display section 1; 3 a display controller
for controlling the development of the pattern into the VRAM 2 and the reading of the pattern out to the CRT display
section 1; 4 a microprocessor (MPU) for controlling the respective sections in an integrated manner; 5 a main memory
comprising a ROM holding a control program and a RAM serving as a work memory for data processes; 6 a character
recognizer for matching a character image against patterns and for generating a character code; 7 an external magnetic
disk device into which the result of the discrimination and candidate characters are written; 8 a pointing device (PD)
which also serves as indicating means of the invention and designates an arbitrary position on the CRT display section 1;
9 a keyboard (KBD); and 10 an I/O bus for connecting each block with the MPU 4.

[0014] The block diagram of Fig. 1 has been explained with respect to the character recognizing
apparatus. However, a recognizing function can also be added to a character processing apparatus as mentioned
above. In such a case, a document processing program is also stored in the ROM. An MPU is also provided in the
character recognizer 6, and its processes are executed in parallel with the document processing control by the MPU 4.
[0015] When an icon to execute the recognizing process is displayed on the screen during document editing and
the icon is designated by the pointing device (PD) or the like, the recognizing process is started and a menu to select,
for instance, an output form of the result of the recognition is displayed. Alternatively, when an icon to input an image
from a scanner is designated, a menu as shown in Fig. 2 can also be displayed automatically.

[0016] Fig. 2 shows a menu screen which is displayed on the CRT display section 1 and is used to select the
output form arbitrarily. Through the menu screen, the user can designate a desired mode by using the PD 8 or KBD
9 and can thereby designate an output form arbitrarily.

[0017] After the recognition, the MPU 4 builds, from the output data of the character recognizing apparatus, a
character train in the form according to the designated output mode.
[0018] Fig. 3A is a diagram schematically showing an example of a document image. A rectangle 31 which is shown
as a hatched portion denotes a circumscribed rectangle of each character. Reference numeral 32 denotes a rectangle
of a character line.

[0019] The MPU in the recognizer 6 obtains a circumscribed rectangle of each character as shown at 31 in Fig. 3A
from the document image by a character cutting-out process. Subsequently, the character image is recognized
and a character code is obtained as a result of the recognition. Information indicating the character code of each
character and the position of the circumscribed rectangle is stored into the RAM 5 of the apparatus as information
as shown in Fig. 3B.

[0020] In the character cutting-out process which is executed in the recognizer 6, a frequency distribution in the
character line direction is first obtained with respect to the input document image and the position of each character
line is detected. Subsequently, by obtaining the frequency distribution in the vertical direction within the image of each
character line portion, the left and right edges of each character existing in the character line are detected.
Therefore, the first and last characters on each line of a document as an object of the character recognition are
discriminated by the recognizer 6, so that the information regarding the first and last characters is also stored into the
RAM 5.
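The cutting-out process of paragraph [0020] resembles a projection-profile segmentation. The following Python lines are a minimal sketch under stated assumptions, not the recognizer's actual implementation: the image is assumed to be a list of pixel rows with 1 for black and 0 for white, and the names `runs` and `cut_out` are hypothetical. Each returned rectangle takes the height of its character line rather than a tight circumscribed rectangle.

```python
from typing import List, Tuple

def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, length) for each maximal run of non-zero profile entries."""
    out, start = [], None
    for k, v in enumerate(profile):
        if v and start is None:
            start = k                      # a run of black pixels begins
        elif not v and start is not None:
            out.append((start, k - start)) # the run ended at k - 1
            start = None
    if start is not None:
        out.append((start, len(profile) - start))
    return out

def cut_out(image: List[List[int]]) -> List[List[Tuple[int, int, int, int]]]:
    """Return, per character line, the (x, y, w, h) rectangles of its characters."""
    lines = []
    row_profile = [sum(row) for row in image]          # distribution along the line direction
    for y, h in runs(row_profile):                     # each run of rows is one character line
        band = image[y:y + h]
        col_profile = [sum(r[x] for r in band) for x in range(len(image[0]))]
        lines.append([(x, y, w, h) for x, w in runs(col_profile)])
    return lines
```

The first and last entries of each inner list correspond to the first and last characters of that line.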

[0021] Fig. 4 is a diagram showing an example of the main data format in the RAM 5 of the apparatus.

[0022] Seven items of information for each character are stored in the character information storage area 41 in
accordance with the order of appearance of the characters. CD(i) (0 ≤ i < the total number of characters) indicates a
character code obtained by the recognizer 6. As shown in Fig. 3B, CPX(i), CPY(i), CPW(i), and CPH(i)
denote the circumscribed rectangle; CPX(i) and CPY(i) indicate the X and Y coordinates of the upper left corner of the
circumscribed rectangle; and CPW(i) and CPH(i) represent its width and height, respectively. BL(i) denotes the number of
blank characters to be inserted just before the character. LF(i) indicates the number of return characters which
are inserted just after the character as a delimiter of the paragraph. Initial values of BL(i) and LF(i) are set to 0, and
values are stored therein by processes which will be explained hereinafter. Values are stored into the fields other
than BL(i) and LF(i) by the character recognizing apparatus.

[0023] Information regarding each character line is stored into the character line information storage area 42 in
accordance with the order of appearance of the character lines. The address, in the character information storage
area, of the first character included in the character line is stored in LS(j). As shown in Fig. 3B, LPX(j),
LPY(j), LPW(j), and LPH(j) denote the rectangle of a character line and indicate the minimum rectangle enclosing
all of the characters included in the character line. PT(j) denotes the character pitch of the character train constituting
the character line. A mean value is obtained from the character positions and stored into PT(j). Similarly, WD(j) denotes
the mean character width of the character line.
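The two storage areas of Fig. 4 can be pictured as records. The sketch below is illustrative only: the class names `CharInfo` and `LineInfo` and the helper `build_line` are hypothetical, and the source does not say exactly how the mean pitch PT(j) is computed, so the mean distance between successive left edges is assumed here.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CharInfo:
    cd: str      # CD(i): character code from the recognizer
    cpx: int     # CPX(i): X of the upper left corner
    cpy: int     # CPY(i): Y of the upper left corner
    cpw: int     # CPW(i): width
    cph: int     # CPH(i): height
    bl: int = 0  # BL(i): blanks before the character, initially 0
    lf: int = 0  # LF(i): returns after the character, initially 0

@dataclass
class LineInfo:
    ls: int      # LS(j): index of the first character of the line
    lpx: int     # LPX(j), LPY(j), LPW(j), LPH(j): line rectangle
    lpy: int
    lpw: int
    lph: int
    pt: float    # PT(j): mean character pitch
    wd: float    # WD(j): mean character width

def build_line(chars: List[CharInfo], first: int, last: int) -> LineInfo:
    """Summarize characters first..last (inclusive) as one character line."""
    span = chars[first:last + 1]
    lpx = min(c.cpx for c in span)
    lpy = min(c.cpy for c in span)
    right = max(c.cpx + c.cpw for c in span)
    bottom = max(c.cpy + c.cph for c in span)
    wd = sum(c.cpw for c in span) / len(span)
    # Assumed pitch definition: mean step between successive left edges.
    steps = [b.cpx - a.cpx for a, b in zip(span, span[1:])]
    pt = sum(steps) / len(steps) if steps else wd
    return LineInfo(first, lpx, lpy, right - lpx, bottom - lpy, pt, wd)
```

A line containing wide blanks would inflate a pitch computed this way, so a practical implementation might use a median or exclude large gaps; that choice is outside what the text specifies.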

[0024] Fig. 5 is a flowchart for outputting the result of the recognition. This outputting process is executed in
accordance with the program in the ROM 5 under the control of the MPU 4.
[0025] First, in step S5-1 of detecting the right and left edges of a document, an X coordinate LX of the left edge and
an X coordinate RX of the right edge of a target document image (for instance, refer to Fig. 7) are obtained. The left
edge coordinate LX is obtained as the minimum value of the left edges LPX(j) (0 ≤ j < NL, where NL denotes the number
of character lines) of the character line rectangles in the character line information storage area 42. Similarly, the right
edge coordinate RX is obtained as the maximum value of the right edge coordinates LPX(j) + LPW(j) of the character
lines. The left edge coordinate LX and the right edge coordinate RX thus obtained are stored in the RAM 5.
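Step S5-1 reduces to a minimum and a maximum over the line rectangles. A minimal sketch, assuming each character line is given as an (LPX(j), LPW(j)) pair; the function name `document_edges` is hypothetical.

```python
from typing import List, Tuple

def document_edges(lines: List[Tuple[int, int]]) -> Tuple[int, int]:
    """lines holds (LPX(j), LPW(j)) pairs; returns (LX, RX)."""
    lx = min(x for x, _ in lines)          # leftmost left edge
    rx = max(x + w for x, w in lines)      # rightmost right edge LPX(j) + LPW(j)
    return lx, rx
```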
[0026] In a character height detecting step S5-2, a character height FH of the portion corresponding to the body of the
document is obtained. To eliminate the character lines other than the body, the portion in which character line
rectangles having the same height as that of the preceding line appear consecutively most often is extracted, and the
mean value of their heights is calculated and set as the character height FH. Whether adjacent character lines have
the same height is discriminated as follows.

\[ 1 - a < \frac{\mathrm{LPH}(j-1)}{\mathrm{LPH}(j)} < 1 + a, \qquad 1 \le j < NL \qquad \text{(equation S5-2-1)} \]

a is a constant to absorb errors and is set to, e.g., 0.2. Consequently, a head portion 7-1 in Fig. 7 is eliminated.
[0027] In step S5-3 of detecting intervals between character lines, an interval LS between character lines of the
portion corresponding to the body is obtained. In a manner similar to the character height detecting step, the portion
in which the same space between character lines as the space between the preceding character lines appears
consecutively most often is extracted, and the mean value of those spaces is calculated and set as the interval LS
between character lines. Whether adjacent intervals between character lines are the same is discriminated as follows.

\[ 1 - a < \frac{\mathrm{LPY}(j+1) - (\mathrm{LPY}(j) + \mathrm{LPH}(j))}{\mathrm{LPY}(j) - (\mathrm{LPY}(j-1) + \mathrm{LPH}(j-1))} < 1 + a, \qquad 1 \le j < NL - 1 \qquad \text{(equation S5-3-1)} \]

[0028] The character height FH and the interval LS between character lines are stored in the RAM. Thus, paragraphs
7-2 and 7-3 can be discriminated in Fig. 7, which will be explained hereinbelow.
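Steps S5-2 and S5-3 share one pattern: find the stretch of consecutive values that resemble their predecessor and average it. The sketch below reads "appear consecutively most often" as the longest such run, which is an interpretation rather than something the text states. The names `within`, `longest_run_mean`, and `body_metrics` are hypothetical, and the upper bound 1 + a is assumed to apply to the interval test as it does to the height test.

```python
from typing import Callable, List, Tuple

def within(num: float, den: float, a: float) -> bool:
    """True when 1 - a < num / den < 1 + a (False for a non-positive denominator)."""
    return den > 0 and 1 - a < num / den < 1 + a

def longest_run_mean(values: List[float], same: Callable[[float, float], bool]) -> float:
    """Mean of the longest run in which each value is 'same' as its predecessor."""
    best_start, best_len, start = 0, 1, 0
    for k in range(1, len(values)):
        if not same(values[k - 1], values[k]):
            start = k                              # the run is broken; restart at k
        if k - start + 1 > best_len:
            best_start, best_len = start, k - start + 1
    run = values[best_start:best_start + best_len]
    return sum(run) / len(run) if run else 0.0

def body_metrics(lines: List[Tuple[float, float]], a: float = 0.2) -> Tuple[float, float]:
    """lines holds (LPY(j), LPH(j)) pairs; returns (FH, LS)."""
    heights = [h for _, h in lines]
    gaps = [lines[j + 1][0] - (lines[j][0] + lines[j][1]) for j in range(len(lines) - 1)]
    fh = longest_run_mean(heights, lambda prev, cur: within(prev, cur, a))  # LPH(j-1)/LPH(j)
    ls = longest_run_mean(gaps, lambda prev, cur: within(cur, prev, a))     # next gap / previous gap
    return fh, ls
```

For a tall heading followed by four body lines, `body_metrics([(0, 40), (60, 20), (90, 20), (120, 20), (170, 20)])` yields a height of 20 and an interval of 10, so the heading and the wider gaps do not distort the body values.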

[0029] In step S5-4 of detecting a blank, blanks corresponding to blank characters are found in each character
line and stored into the RAM 5 as BL(i) in the character information storage area 41.

[0030] In a character line j, it is calculated how many times larger the length {CPX(i) - (CPX(i - 1) + CPW(i - 1))}
of the blank between a character i-1 and a character i (1 ≤ i < the number of characters on the character line j) is
than the character pitch PT(j) of the character line. The integer value which is closest to that multiple is
stored as BL(i) into the RAM 5. If there is no blank character, 0 is substituted for BL(i) and is stored into the RAM 5.
For the first character s on the character line, the MPU 4 calculates, as an integer, how many times larger the
difference between CPX(s) and the left edge coordinate LX of the document is than the character pitch PT(j). The result
of the calculation is stored as BL(s) into the RAM 5.
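The blank count of step S5-4 is a rounded ratio of gap to pitch. A minimal sketch for one character line, assuming the characters are given left to right as (CPX, CPW) pairs; the function name `blank_counts` is hypothetical, and rounding exact halves upward is an assumption, since the text only asks for the closest integer.

```python
import math
from typing import List, Tuple

def blank_counts(line_chars: List[Tuple[float, float]], pitch: float, lx: float) -> List[int]:
    """Return BL for each character of one line; pitch is PT(j), lx is LX."""
    bl = []
    for k, (x, _) in enumerate(line_chars):
        if k == 0:
            gap = x - lx                       # first character: distance from the left edge
        else:
            px, pw = line_chars[k - 1]
            gap = x - (px + pw)                # CPX(i) - (CPX(i-1) + CPW(i-1))
        n = math.floor(gap / pitch + 0.5)      # nearest integer, halves rounded up
        bl.append(max(0, n))                   # never a negative number of blanks
    return bl
```

With a pitch of 10 and a left edge of 10, the line `[(30, 8), (40, 8), (60, 8), (90, 8)]` gives `[2, 0, 1, 2]`: a two-character indent, an ordinary inter-character gap, one blank, and two blanks.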

[0031] In paragraph detecting step S5-5, a delimiter of the paragraph is found and a predetermined value is
stored into the RAM 5 as LF(i) of the delimiter character.

[0032] The delimiter of the paragraph is determined by the MPU 4 under the following conditions.

(1) A portion in which the interval between character lines is larger than the interval LS between character lines
of the body is a turning point of the paragraph.

(2) A character line having a blank in the head portion is the first line of the paragraph.

(3) A character line in which the last character does not reach the right edge of the document is the last line of the
paragraph.


[0033] In practice, a boundary of the paragraph is decided under the above items (1) to (3) when any
one of the following conditional equations is satisfied.

[0034] With respect to the character line j (1 ≤ j < NL), the above conditions (1) to (3) are expressed by the following
equations.

\[ (1')\quad \frac{\mathrm{LPY}(j) - (\mathrm{LPY}(j-1) + \mathrm{LPH}(j-1))}{LS} > m \]

\[ (2')\quad \mathrm{BL}(s) > 0 \]

\[ (3')\quad RX - \mathrm{CPX}(e) > 2\,\mathrm{WD}(j-1) \]

s denotes the character number of the first character of the character line j, e indicates the character number of the
last character of the character line j-1, and m represents a constant such as 2.

[0035] If the equation (2') or (3') is satisfied, 1 is substituted into LF(e). If the equation (1') is satisfied and the integer
value which is closest to the value of {LPY(j) - (LPY(j - 1) + LPH(j - 1))}/(FH + LS) is equal to or larger than 1, the value
of (that integer value + 1) is substituted into LF(e). If the integer value is equal to 0, 2 is substituted into LF(e). The
processing procedure will be explained in detail with reference to Figs. 7 and 8.

[0036] In character train outputting step S5-6, a desired character train is generated in the designated outputting
mode.

[0037] As the outputting mode, the user can designate, by using the PD 8 or KBD 9 on the display screen of the
CRT 1 as shown in Fig. 2, whether blanks in the line are recognized, and whether a return code is inserted at every
line end of the document to be recognized, at every paragraph end thereof, or not at all.

[0038] Fig. 6 is a detailed flowchart of the character train outputting section. A program to control the above processes
is stored in the ROM 5 and the processes are executed under the control of the MPU 4.

[0039] Processes in steps S6-3 to S6-14 are performed for the NL character lines from the 0th line to the (NL - 1)th line.
In step S6-3, the character number of the first character of the character line j is substituted for i. The first character
of the character line can be known from LS(j) of the character line information storage area. Processes in steps S6-5
to S6-12 are performed with respect to each character included in the character line j. Whether the character is the
last character of the character line is discriminated by the conditional branch in step S6-4. First, if the outputting
mode has been set to the mode to recognize the blanks in the line (step S6-5) and if there are blank characters
before the character i (step S6-6), BL(i) blank characters are generated
onto the CRT 1 in step S6-7. If NO in step S6-5 or S6-6, nothing is generated and the processing routine advances
to step S6-8. In step S6-8, the character code CD(i) obtained by the recognizer is supplied to the display
controller 3. After that, if the outputting mode has been set to the mode to recognize the return code at the end of the
paragraph (step S6-9) and if a return code should be inserted (step S6-10), LF(i) return codes are supplied to the
display controller 3 in step S6-11. After completion of the processes for all of the characters of the character line j, a
check is made in step S6-13 to see whether the outputting mode has been set to the mode to return at every line end. If
YES, one return code is unconditionally supplied to the display controller 3 in step S6-14. If it is determined in step
S6-2 that the processes for all of the character lines have been executed, the above processing routine is finished.
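The loop of Fig. 6 can be sketched as follows. This is an illustration under assumptions, not the stored program: each character is taken to be a (CD, BL, LF) tuple, blanks and returns are rendered as a space and a newline, the result is returned as a string instead of being sent to the display controller, and the names `output_train`, `RETURN_NONE`, `RETURN_PARAGRAPH`, and `RETURN_LINE` are hypothetical.

```python
from typing import List, Sequence, Tuple

RETURN_NONE, RETURN_PARAGRAPH, RETURN_LINE = range(3)

def output_train(lines: Sequence[Sequence[Tuple[str, int, int]]],
                 recognize_blanks: bool, return_mode: int) -> str:
    """Build the output character train from per-line lists of (CD, BL, LF)."""
    out: List[str] = []
    for line in lines:                                      # S6-2: every character line
        for cd, bl, lf in line:                             # S6-4: every character
            if recognize_blanks and bl > 0:                 # S6-5, S6-6
                out.append(" " * bl)                        # S6-7: BL(i) blanks
            out.append(cd)                                  # S6-8: the character code
            if return_mode == RETURN_PARAGRAPH and lf > 0:  # S6-9, S6-10
                out.append("\n" * lf)                       # S6-11: LF(i) returns
        if return_mode == RETURN_LINE:                      # S6-13
            out.append("\n")                                # S6-14: one return per line
    return "".join(out)
```

Because BL(i) and LF(i) are computed once and only consulted here, the same recognition result can be re-rendered in any of the output modes without repeating the recognition.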

[0040] In the embodiment, all of the processes in steps S5-1 to S5-5 have been executed irrespective of the kind of
outputting mode. However, some of the processes can also be skipped in accordance with the outputting mode. For
instance, in the case of the mode in which there is no need to recognize the paragraph, it is unnecessary to perform
the processes in steps S5-2 and S5-3. In the case of the mode in which there is no need to recognize both of the
paragraph and the blanks in the line, the processes in steps S5-1 to S5-5 are unnecessary. By adding such a skipping
control, the processing speed can be improved.
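The skipping control of paragraph [0040] amounts to choosing a subset of steps from the two mode flags. A small sketch that encodes only the two cases named in the text; the function name `steps_to_run` and the representation of steps as strings are hypothetical.

```python
from typing import List

ALL_STEPS = ["S5-1", "S5-2", "S5-3", "S5-4", "S5-5"]

def steps_to_run(recognize_blanks: bool, recognize_paragraphs: bool) -> List[str]:
    """Return the analysis steps needed before character train output (S5-6)."""
    if not recognize_blanks and not recognize_paragraphs:
        return []                                   # S5-1 to S5-5 are all unnecessary
    if not recognize_paragraphs:
        # The text names only S5-2 and S5-3 as skippable in this mode.
        return [s for s in ALL_STEPS if s not in ("S5-2", "S5-3")]
    return list(ALL_STEPS)                          # paragraphs rely on every step
```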

[0041] Paragraph detecting step S5-5 in Fig. 5 will now be described in detail. In this step, a delimiter of the paragraph
is found and a predetermined value is stored into LF(i) of the character corresponding to the delimiter.
[0042] The delimiter of the paragraph is determined by the following conditions.

(1) A portion in which the interval between character lines is larger than the interval LS between character lines
of the sentence is a turning point of the paragraph.

(2) A character line having a blank in the head portion is a first line of the paragraph.

(3) A character line in which the last character does not reach the right edge of the document is a last line of the
paragraph.

[0043] Fig. 7 is a schematic diagram of the input document.

[0044] In the example document of Fig. 7, a character line 7-1 corresponds to a head portion, reference numeral
7-2 denotes a sentence, and reference numeral 7-3 corresponds to, for instance, an itemized portion. The paragraph
detecting section is intended to detect each of the character line 7-1, the character lines 7-4 to 7-5, the character lines
7-6 to 7-7, the character lines 7-8 to 7-9, and the character line 7-10 as a paragraph. Since the character line 7-8 has
characters up to the right edge, its paragraph continues to the character line 7-9.
[0045] Fig. 8 is a detailed flowchart of the paragraph detecting section. A program to control the processes is stored
in the ROM 5 and the processes are executed under the control of the MPU 4.

[0046] The processes are started from the character line No. 1 and executed sequentially for the character
lines (step S8-1). Processes in steps S8-3 to S8-12 are executed for each character line
j. First, the character No. of the first character s of the character line j is stored into a register s provided in the MPU
4 or main memory 5 in Fig. 1. Registers to store the information of the other characters are likewise provided in
the MPU 4 or main memory 5 in Fig. 1 and are used as storage by the program. The area of the register
s is held in the RAM 5. The character No. of the first character can be obtained from LS(j) in the character line information
storage area 42. Similarly, the character No. of the last character e of the character line j-1 preceding the character line j
is stored into the register e. An area for the character e is held in the RAM 5. The character No. of the last character e
of the preceding character line can be easily obtained because that character is located just before the character
indicated by LS(j) in the character line information storage area 42. In step S8-5, if the interval between the character
line j-1 and the character line j is larger than the interval LS between the character lines of the body portion, it is
determined that a paragraph boundary lies between the character lines j-1 and j, so that the processing routine
advances to step S8-6.

[0047] The discrimination in step S8-5 is practically executed as follows.

\[ \frac{\mathrm{LPY}(j) - (\mathrm{LPY}(j-1) + \mathrm{LPH}(j-1))}{LS} > m \qquad \text{(equation S8-5-1)} \]

where m is a constant such as 2.

[0048] In step S8-6, the number of blank lines which can be inserted therebetween is calculated and the calculated value
is stored into a register n. The value of n is practically expressed by the following equation.

\[ n = \frac{\mathrm{LPY}(j) - (\mathrm{LPY}(j-1) + \mathrm{LPH}(j-1))}{FH + LS} \qquad \text{(equation S8-6-1)} \]

[0049] When n is 1 or more, the value of (n + 1) is stored into LF(e) in steps S8-7 and S8-8. If n is smaller than 1, 2
is stored into LF(e). The value in LF(e) is equal to the number of return codes when the return codes are generated
after that. In step S8-10, a check is made to see whether there is a blank before the first character s of the character
line j. In practice, such a discrimination is performed by checking whether BL(s) in the character information
storage area 41 is larger than 0. If it is determined that a blank exists before the first character s, step S8-11
follows and 1 is stored into LF(e).
[0050] In step S8-12, a check is made to see whether there is a blank after the last character e of the character line j-1.
The blank is discriminated by the distance between the position of the last character e and the right edge of the
document. In practice, such a blank is discriminated as follows.

\[ RX - \mathrm{CPX}(e) > 2\,\mathrm{WD}(j-1) \qquad \text{(equation S8-12-1)} \]

Thus, the character lines 7-5, 7-9, and 7-10 in Fig. 7 can be recognized as the ends of paragraphs.
[0051] By executing the above processes with respect to the character lines of the Nos. j (= 1) to NL-1, the processing
routine is finished. Thus, by referring to the value in LF(i) of each character, the delimiter of the paragraph can be
recognized, because a character in which a value larger than 0 has been stored is the last character of a paragraph.
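The flow of Fig. 8 can be sketched as a single pass over adjacent line pairs. Several points here are assumptions: each character line is reduced to a tuple (LPY, LPH, WD, BL of its first character, CPX of its last character), the three tests are applied as mutually exclusive branches in the order S8-5, S8-10, S8-12, exact halves are rounded upward when taking the nearest integer, and the function name `paragraph_returns` is hypothetical. The interval `ls` is assumed to be positive.

```python
import math
from typing import List, Tuple

Line = Tuple[float, float, float, int, float]  # (LPY, LPH, WD, BL(s), CPX of last char)

def paragraph_returns(lines: List[Line], fh: float, ls: float, rx: float,
                      m: float = 2.0) -> List[int]:
    """Return LF for the last character of each line (0 where no paragraph ends)."""
    lf = [0] * len(lines)
    for j in range(1, len(lines)):
        lpy, _, _, first_bl, _ = lines[j]
        p_lpy, p_lph, p_wd, _, p_last_cpx = lines[j - 1]
        gap = lpy - (p_lpy + p_lph)                 # space between lines j-1 and j
        if gap / ls > m:                            # S8-5: wider than the body interval
            n = math.floor(gap / (fh + ls) + 0.5)   # S8-6: blank lines that fit
            lf[j - 1] = n + 1 if n >= 1 else 2      # S8-7 to S8-9
        elif first_bl > 0:                          # S8-10: line j starts with a blank
            lf[j - 1] = 1                           # S8-11
        elif rx - p_last_cpx > 2 * p_wd:            # S8-12: line j-1 stops short
            lf[j - 1] = 1
    return lf
```

With `fh = 20`, `ls = 10`, and `rx = 200`, a heading followed after a wide gap by two indented two-line paragraphs, `[(0, 20, 10, 0, 100), (60, 20, 10, 1, 185), (90, 20, 10, 0, 120), (120, 20, 10, 1, 185), (150, 20, 10, 0, 100)]`, gives `[2, 0, 1, 0, 0]`: two returns after the heading and one after the first paragraph.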



Claims 

1. An image processing apparatus comprising:

input means (MPU 4) for inputting image data; and

extraction means (MPU 4) for extracting a plurality of pieces of character position information from the image
data inputted by said input means;

detection means (S5-4) for detecting a plurality of spaces between each adjacent two of the plurality of pieces
of character position information extracted by said extraction means;

characterized by:

derivation means for deriving a mean pitch value (PT(J)) from said character position information detected by
said detection means for each line (J) of input image data;

means for calculating (S5-4) by which number of times each one of said detected plurality of spaces (CPX(i) -
CPX(i-1) + CPW(i-1)) is larger than said character pitch (PT(J));

means for setting (S5-4) the integer number which is closest to said calculated number of times as the number
of space codes (BL(i)) to be assigned to each one of said plurality of detected spaces; and

output means (S5-6, S6-7) for outputting a number of space codes (BL(i)) equal to said integer number set
by said setting means.

2. An apparatus according to claim 1, wherein said input means comprises a scanner.

3. An apparatus according to claim 1, wherein the space codes comprise blank character codes.

4. An apparatus according to claim 1, wherein the space codes comprise return codes.

5. An image processing method comprising the steps of:

obtaining input image data from an image;

extracting a plurality of pieces of character position information from the input image data;

detecting (S5-4) a plurality of spaces between each adjacent two of the pieces of character position information;

said method being characterized by the further steps of:

deriving a mean pitch value (PT(J)) from said character position information for each line (J) of input image data;

calculating (S5-4) by which number of times each one of said detected plurality of spaces (CPX(i) - CPX(i-1) +
CPW(i-1)) is larger than said character pitch (PT(J));

setting (S5-4) the integer number which is closest to said calculated number of times as the number of space
codes (BL(i)) to be assigned to each one of said plurality of detected spaces; and

outputting (S5-6, S6-7) said number of space codes (BL(i)) to represent that space.

6. A method according to claim 5, in which the image data is obtained by scanning the image.

7. A method according to any of claims 5 and 6, wherein the space codes comprise blank character codes.

8. A method according to any of claims 5 and 6, wherein the space codes comprise return codes.



Claims (translated from the German Patentansprüche)

1. An image processing apparatus, comprising

an input device (MPU 4) for inputting image data, and

an extraction device (MPU 4) for extracting a plurality of pieces of character position information from the
image data input by the input device,

a detection device (S5-4) for detecting a plurality of spaces between each two adjacent pieces of the plurality
of pieces of character position information extracted by the extraction device,

characterized by

a derivation device for deriving a mean pitch value (PT(j)) for each line (j) of the input image data from the
character position information detected by the detection device,

a device for calculating (S5-4) a factor indicating how many times each of the detected plurality of spaces
(CPX(i) - CPX(i-1) + CPW(i-1)) is larger than the character pitch (PT(j)),

a device for setting (S5-4) the integer closest to the calculated factor as the number of space codes (BL(i))
assigned to each of the plurality of detected spaces, and

an output device (S5-6, S6-7) for outputting a number of space codes (BL(i)) equal to the integer set by the
setting device.

2. An apparatus according to claim 1, wherein the input device comprises a scanning device.

3. An apparatus according to claim 1, wherein the space codes comprise blank character codes.

4. An apparatus according to claim 1, wherein the space codes comprise return codes.

5. An image processing method with the steps:

obtaining input image data of an image,

extracting a plurality of pieces of character position information from the input image data,

detecting (S5-4) a plurality of spaces between each two adjacent pieces of the character position information,

characterized by the following steps

deriving a mean pitch value (PT(j)) for each line (j) of the input image data from the character position
information,

calculating (S5-4) a factor indicating how many times each of the detected plurality of spaces (CPX(i) -
CPX(i-1) + CPW(i-1)) is larger than the character pitch (PT(j)),

setting (S5-4) the integer closest to the calculated factor as the number of space codes (BL(i)) assigned to
each of the plurality of detected spaces, and

outputting (S5-6, S6-7) the number of space codes (BL(i)) to represent the space.

6. A method according to claim 5, wherein the image data are obtained by scanning the image.

7. A method according to either of claims 5 and 6, wherein the space codes comprise blank character codes.

8. A method according to either of claims 5 and 6, wherein the space codes comprise return codes.



Claims (translated from the French Revendications)

1. An image processing apparatus comprising:

input means (MPU 4) for inputting image data; and

extraction means (MPU 4) for extracting a plurality of items of character position information from the image
data input by said input means;

detection means (S5-4) for detecting a plurality of spaces between each pair of two adjacent items among
the plurality of items of character position information extracted by said extraction means;

characterized by:

derivation means for producing a mean pitch value (PT(J)) from said character position information detected
by said detection means for each line (J) of input image data;

means for calculating (S5-4) by how many times each space of said detected plurality of spaces (CPX(i) -
CPX(i-1) + CPW(i-1)) is larger than said character pitch (PT(J));

means for setting (S5-4) the integer which is closest to said calculated number of times as the number of
space codes (BL(i)) to be assigned to each space of said detected plurality of spaces; and

output means (S5-6, S6-7) for outputting a number of space codes (BL(i)) equal to said integer set by said
means for setting.

2. An apparatus according to claim 1, wherein said input means comprises a scanner.

3. An apparatus according to claim 1, wherein the space codes comprise blank character codes.

4. An apparatus according to claim 1, wherein the space codes comprise return codes.

5. An image processing method comprising the steps of:

obtaining input image data from an image;

extracting a plurality of items of character position information from the input image data;

detecting (S5-4) a plurality of spaces, each between two adjacent items among the items of character position
information, said method being characterized by the further steps of:

producing a mean pitch value (PT(J)) from said character position information for each line (J) of input image
data;

calculating (S5-4) by how many times each space of said detected plurality of spaces (CPX(i) - CPX(i-1) +
CPW(i-1)) is larger than said character pitch (PT(J));

setting (S5-4) the integer which is closest to said calculated number of times as the number of space codes
(BL(i)) to be assigned to each space of said detected plurality of spaces; and

outputting (S5-6, S6-7) said number of space codes (BL(i)) to represent that space.

6. A method according to claim 5, wherein the image data are obtained by scanning the image.

7. A method according to either of claims 5 and 6, wherein the space codes comprise blank character codes.

8. A method according to either of claims 5 and 6, wherein the space codes comprise return codes.